The "Corpus of Interactional Data" (CID) - Multimodal annotation of conversational speech
Abstract
Understanding language mechanisms requires a precise account of the interaction between all the different domains or modalities, which in turn requires building and developing resources. We describe here the CID (Corpus of Interactional Data), an audio-video corpus of French recorded and processed at the Laboratoire Parole et Langage (LPL). The corpus has been annotated from a multimodal perspective covering phonetics, prosody, morphology, syntax, discourse and gesture studies. The first results of our studies on the CID confirm the relevance of an analysis that takes into account as many linguistic fields as possible in order to gain more precise knowledge of discourse phenomena.
Keywords: multimodal encoding scheme, annotation tools and platform, phonetics, prosody, morphology, syntax, discourse, gesture.
Similar resources
The OTIM Formal Annotation Model: A Preliminary Step before Annotation Scheme
Large annotation projects, typically those addressing the question of multimodal annotation in which many different kinds of information have to be encoded, have to elaborate precise and high-level annotation schemes. Doing this first requires defining the structure of the information: the different objects and their organization. This stage has to be as independent as possible ...
Naïve listeners’ perception of prominence and boundary in French spontaneous speech
Our main goal here is to explore the link between naïve listeners’ perception of prominences and boundaries in spontaneous speech and experts’ annotation of prosodic hierarchy and accentuation in French. We first present the design of our corpus, which consists of 133 utterances extracted from the Corpus of Interactional Data (CID). 73 naïve listeners judged prominences and boundaries using thr...
A quantitative view of feedback lexical markers in conversational French
This paper presents a quantitative description of the lexical items used for linguistic feedback in the Corpus of Interactional Data (CID). The paper includes the raw figures for feedback lexical items as well as more detailed figures concerning interindividual variability. This effort is a first step before a broader analysis including more discourse situations and featuring communicative funct...
Automatic analysis of multiparty meetings
This paper is about the recognition and interpretation of multiparty meetings captured as audio, video and other signals. This is a challenging task since the meetings consist of spontaneous and conversational interactions between a number of participants: it is a multimodal, multiparty, multistream problem. We discuss the capture and annotation of the AMI meeting corpus, the development of a m...
Automatic detection of other-repetition occurrences: application to French conversational Speech
This paper investigates the discursive phenomenon called other-repetitions (OR), particularly in the context of spontaneous French dialogues. It focuses on their automatic detection and characterization. A method is proposed to retrieve OR automatically: this detection is based on rules that are applied to the lexical material only. This automatic detection process has been used to label other-...
Journal title:
- TAL
Volume 49
Publication date: 2008